The Algorithms for FPGA Implementation of Sparse Matrices Multiplication
نویسندگان
چکیده
In comparison to dense matrices multiplication, sparse matrices multiplication real performance for CPU is roughly 5–100 times lower when expressed in GFLOPs. For sparse matrices, microprocessors spend most of the time on comparing matrices indices rather than performing floating-point multiply and add operations. For 16-bit integer operations, like indices comparisons, computational power of the FPGA significantly surpasses that of CPU. Consequently, this paper presents a novel theoretical study how matrices sparsity factor influences the indices comparison to floating-point operation workload ratio. As a result, a novel FPGAs architecture for sparse matrix-matrix multiplication is presented for which indices comparison and floating-point operations are separated. We also verified our idea in practice, and the initial implementations results are very promising. To further decrease hardware resources required by the floating-point multiplier, a reduced width multiplication is proposed in the case when IEEE-754 standard compliance is not required.
منابع مشابه
Investigating the Effects of Hardware Parameters on Power Consumptions in SPMV Algorithms on Graphics Processing Units (GPUs)
Although Sparse matrix-vector multiplication (SPMVs) algorithms are simple, they include important parts of Linear Algebra algorithms in Mathematics and Physics areas. As these algorithms can be run in parallel, Graphics Processing Units (GPUs) has been considered as one of the best candidates to run these algorithms. In the recent years, power consumption has been considered as one of the metr...
متن کاملAn FPGA Drop-In Replacement for Universal Matrix-Vector Multiplication
We present the design and implementation of a universal, single-bitstream library for accelerating matrixvector multiplication using FPGAs. Our library handles multiple matrix encodings ranging from dense to multiple sparse formats. A key novelty in our approach is the introduction of a hardware-optimized sparse matrix representation called Compressed Variable-Length Bit Vector (CVBV), which re...
متن کاملAn FPGA implementation of Walsh-Hadamard transforms for signal processing
This paper describes two approaches suitable for an FPGA implementation of Walsh-Hadamard transforms. These transforms are important in many signal-processing applications including speech compression, filtering and coding. Two novel architectures for the Fast Hadamard Transforms using both systolic architecture and distributed arithmetic techniques are presented. The first approach uses the Ba...
متن کاملAlgorithms and Data Structures for Very Large Sparse Matrices
Sparse matrix-vector multiplication (SpMV) is one of the most important numerical core methods in scientific and engineering computing. On today’s petascale and future exascale systems, the three principal problems related to SpMV are the minimization of communication between processors in distributed-memory environments, the maximization of the efficiency of local SpMV performed by a single pr...
متن کاملFast sparse matrix multiplication on GPU
Sparse matrix multiplication is an important algorithm in a wide variety of problems, including graph algorithms, simulations and linear solving to name a few. Yet, there are but a few works related to acceleration of sparse matrix multiplication on a GPU. We present a fast, novel algorithm for sparse matrix multiplication, outperforming the previous algorithm on GPU up to 3× and CPU up to 30×....
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computing and Informatics
دوره 33 شماره
صفحات -
تاریخ انتشار 2014